A new power G-family of distributions: Properties, estimation, and applications

This article proposes a new method for extending a family of life distributions by adding a parameter to the family, which increases its flexibility. The result is called the extended Modi-G family of distributions. We derive the general statistical properties of the proposed family. Several estimation methods are presented for the parameters of the proposed family: maximum likelihood, ordinary least squares, weighted least squares, Anderson-Darling, right-tailed Anderson-Darling, Cramér-von Mises, and maximum product of spacings. A special sub-model with three parameters, called the extended Modi exponential distribution, is derived along with the different shapes of its density and hazard functions. Randomly generated data sets and the different estimation methods are used to illustrate the behavior of the parameters of the proposed sub-model. To illustrate the advantage of the proposed family over other well-known models, applications to medicine and geology data sets are analyzed.


Introduction
In probability distribution theory, the choice of a particular probability distribution for modeling real-life phenomena often depends on how flexible the distribution is. The tractability of a probability distribution is helpful in theory because such a distribution is easy to work with, particularly with regard to generating random samples. Still, flexibility is often of greater interest to practitioners. It is preferable to use a probability distribution that fits the available data set well rather than to transform the data. Therefore, many attempts have been made recently to modify and extend the standard theoretical distributions. This increases their adaptability and their capacity to model real-life data sets.
Different methods can be used to extend a standard distribution. For example, the flexibility of a distribution can be increased through generalization, which involves using an available generalized family of distributions. When a distribution is generalized, one or more extra shape parameters from the family are added. The job of these additional shape parameters is to change the tail weight of the resulting compound distribution, thereby introducing skewness. Generalizing classical distributions is a long-standing practice, as important as many other practical problems in statistics. These generalizations introduce additional location, scale, or shape parameters to the original model. This branch of statistics has received considerable attention, and many general classes of distributions have been derived in recent years. Azzalini [1] introduced the skew-normal distribution by adding an extra parameter to the normal distribution to increase flexibility. Mudholkar and Srivastava [2] proposed a technique for adding an extra parameter to the two-parameter Weibull distribution. Marshall and Olkin [3] introduced another method for adding a parameter to expand a family of distributions. Eugene et al. [4] introduced the beta generalized family of distributions, which is derived from the logit of the beta random variable and has two extra shape parameters. Cordeiro and de Castro [5] created and studied another family of generalized distributions based on the Kumaraswamy distribution. Zografos and Balakrishnan [6] presented a gamma generalized family of distributions. A type II gamma generalized family of distributions was presented by Ristic and Balakrishnan [7]. The McDonald generalized family of distributions was generated from the McDonald random variable by Alexander et al. [8]. Alzaatreh et al. [9] presented a new method of generating families of continuous distributions called the T-X family. Mahdavi and Kundu [10] introduced a new method for deriving statistical distributions. For further surveys of methods for generating distributions, see Lee et al. [11], Jones [12], and Ahmad et al. [13].
Even though several distributions in the literature can be used to analyze data in many domains, a more adaptable distribution that works well in various circumstances is still required. The main aim of this paper is to propose a new family of generalized distributions. Several mathematical properties of the family are investigated, and several methods are used for parameter estimation. A new statistical model is derived using the exponential distribution as the baseline for the proposed family. The additional parameters change the tail length of the exponential distribution and produce densities that are skewed to the right or to the left. The behavior of the new model's estimators was examined via simulation, and all estimation methods showed the consistency property for all measures. Three real data sets were used to demonstrate the applicability of the suggested model compared to other models.
The article is organized as follows. The proposed family is presented in Section 2, and Section 3 gives its statistical properties, including the quantile function, moments, the moment generating function, incomplete moments, and the Rényi entropy. The maximum likelihood estimation (MLE), ordinary least-squares estimation (OLSE), weighted least-squares estimation (WLSE), Anderson-Darling estimation (ADE), right-tailed Anderson-Darling estimation (RTADE), Cramér-von Mises estimation (CVME), and maximum product of spacings estimation (MPSE) techniques for the parameters of the proposed family are described in Section 4. Section 5 defines the extended Modi exponential distribution (ExMED). Section 6 studies the estimation of the ExMED parameters through simulation. In Section 7, three real data sets illustrate the performance of the ExMED. Finally, some concluding remarks are offered in Section 8.

Family formulation
Modi et al. [14] presented a new Modi family of probability distributions. Their CDF and PDF are given by respectively, where x, α, β > 0. They studied the statistical properties of the Modi exponential distribution and applied it to two real data sets. This section presents a relatively new family of generalized distributions called the extended Modi-G family of distributions. Its
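As a reading aid, the expressions referred to as CDF (1) and PDF (2) can be recovered from the formula for f(x; α, θ, ϕ)^r quoted in the Entropy subsection. The sketch below assumes that this formula with r = 1 is PDF (2), that G and g are the baseline CDF and PDF with parameter vector ϕ, and that α, θ > 0. The CDF is obtained by integrating the PDF and is a reconstruction, not a quotation.

```latex
f(x;\alpha,\theta,\phi)
  = \frac{\theta\,\alpha\,(\alpha+1)\,G(x;\phi)^{\theta-1}\,g(x;\phi)}
         {\bigl(\alpha+G(x;\phi)^{\theta}\bigr)^{2}},
\qquad
F(x;\alpha,\theta,\phi)
  = \frac{(1+\alpha)\,G(x;\phi)^{\theta}}{\alpha+G(x;\phi)^{\theta}} .
```

Differentiating F with respect to x gives (1+α) θ G^{θ-1} g [(α + G^θ) - G^θ] / (α + G^θ)^2, which is the stated PDF, and F tends to 1 as G tends to 1.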

Mathematical properties
This section provides some mathematical properties of the extended Modi-G family of distributions, such as linear representation, quantile function, moments, moment-generating function, incomplete moments, and entropy.

Linear representation
A useful linear representation of CDF (1) and PDF (2) is introduced in this subsection. For −1
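One way such a representation can arise is sketched below. It assumes the PDF form implied by the Entropy subsection and uses the binomial series for (1+z)^{-2}, which converges only for |z| < 1, that is, where G(x; ϕ)^θ < α. The weights w_k are notation introduced here for illustration, and the article's own expansion may differ.

```latex
(1+z)^{-2} = \sum_{k=0}^{\infty} (-1)^{k}(k+1)\,z^{k}, \qquad |z|<1,
\\[4pt]
f(x;\alpha,\theta,\phi)
  = \frac{\theta(\alpha+1)}{\alpha}\,G^{\theta-1} g\,
    \Bigl(1+\tfrac{G^{\theta}}{\alpha}\Bigr)^{-2}
  = \sum_{k=1}^{\infty} w_k\, G(x;\phi)^{\theta k-1} g(x;\phi),
\qquad
w_k = (-1)^{k-1}\,\frac{k\,\theta\,(\alpha+1)}{\alpha^{k}},
\\[4pt]
F(x;\alpha,\theta,\phi)
  = \sum_{k=1}^{\infty} \frac{w_k}{\theta k}\, G(x;\phi)^{\theta k}
  = (\alpha+1)\sum_{k=1}^{\infty} (-1)^{k-1}\Bigl(\tfrac{G^{\theta}}{\alpha}\Bigr)^{k}.
```

Summing the last geometric series gives (α+1) G^θ / (α + G^θ), so the expansion is consistent with the closed-form CDF where it converges.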

Quantile function
Let X be a random variable with CDF (1). The quantile function (QF) of X, Q(u), is obtained by solving F(x) = u for x and is expressed through $G^{-1}$, the QF of the baseline distribution, for $u \in (0, 1)$. Setting u = 0.25, 0.5, 0.75 gives the first, second, and third quartiles, respectively, which are used to determine Bowley's skewness (BS) [15] as

$$BS = \frac{Q(3/4) + Q(1/4) - 2\,Q(2/4)}{Q(3/4) - Q(1/4)}$$

and Moors' kurtosis (MK) [16] as

$$MK = \frac{Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)}{Q(6/8) - Q(2/8)}. \qquad (11)$$
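The quantile-based measures can be illustrated numerically. The following is a minimal sketch, not the article's implementation: it assumes the CDF F = (1+α)G^θ/(α+G^θ) implied by the Entropy subsection and, purely for illustration, an exponential baseline G(x) = 1 − exp(−bx) with rate b (the parameterization of b is an assumption). All function names are hypothetical.

```python
import math

def exmed_cdf(x, alpha, theta, b):
    """Assumed CDF: (1+alpha)*G^theta / (alpha + G^theta), G(x) = 1 - exp(-b*x)."""
    g_theta = (1.0 - math.exp(-b * x)) ** theta
    return (1.0 + alpha) * g_theta / (alpha + g_theta)

def exmed_quantile(u, alpha, theta, b):
    """Invert F(x) = u, using G(x)^theta = alpha*u / (1 + alpha - u)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie in (0, 1)")
    base = (alpha * u / (1.0 + alpha - u)) ** (1.0 / theta)
    return -math.log(1.0 - base) / b  # G^{-1} of the exponential baseline

def bowley_skewness(q):
    """Bowley's skewness from a quantile function q(u)."""
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))

def moors_kurtosis(q):
    """Moors' kurtosis from a quantile function q(u)."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

if __name__ == "__main__":
    q = lambda u: exmed_quantile(u, alpha=2.0, theta=1.5, b=0.7)
    print("median:", q(0.5))
    print("BS:", bowley_skewness(q), "MK:", moors_kurtosis(q))
```

Passing the quantile function as a callable keeps the two shape measures independent of the baseline, so another baseline only requires a different `exmed_quantile`.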

Moments and moment generating function
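Moments play an important role in statistical analysis, especially in applications. The r-th moment of the extended Modi-G family of distributions is defined as

$$\mu'_r = E(X^r) = \int_{-\infty}^{\infty} x^r f(x;\alpha,\theta,\phi)\,dx; \qquad (12)$$

using (8) in (12), we have

$$\mu'_r = \sum_{k} w_k \int_{-\infty}^{\infty} x^r\, G(x;\phi)^{\theta k-1}\, g(x;\phi)\,dx, \qquad (13)$$

where $w_k$ denotes the coefficients of the linear representation (8). The moment generating function (MGF) of the extended Modi-G random variable X is given by

$$M_X(t) = \sum_{m=0}^{\infty} \frac{t^m}{m!} \sum_{k} w_k \int_{-\infty}^{\infty} x^m\, G(x;\phi)^{\theta k-1}\, g(x;\phi)\,dx; \qquad (14)$$

replacing t by it gives the characteristic function of the extended Modi-G family of distributions.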

Incomplete moments
Let X be a random variable with PDF (2); then its r-th incomplete moment is given by

$$I_r(y;\alpha,\theta,\phi) = \int_{-\infty}^{y} x^r f(x;\alpha,\theta,\phi)\,dx.$$

The first incomplete moment of X, $I_1(y;\alpha,\theta,\phi)$, is obtained by setting r = 1. Using Eq (15), we obtain the Lorenz, Bonferroni, and Zenga curves [17], which are evaluated at the quantile $x_p$ satisfying $F(x_p;\alpha,\theta,\phi) = p$. Also, using Eqs (1) and (15), we can determine the mean residual life (MRL) and the mean inactivity time (MIT).
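For orientation, the textbook definitions of these quantities in terms of the first incomplete moment are collected below. They are stated under the assumption that the mean μ'_1 exists and that I_1(y) denotes the first incomplete moment. The article's own displayed forms are not recoverable here, and the Zenga curve in particular has more than one definition in the literature.

```latex
L(p) = \frac{I_1(x_p)}{\mu'_1}, \qquad
B(p) = \frac{L(p)}{p}, \qquad
Z(p) = 1 - \frac{L(p)\,(1-p)}{p\,\bigl(1-L(p)\bigr)},
\\[4pt]
\mathrm{MRL}(t) = E(X-t \mid X>t) = \frac{\mu'_1 - I_1(t)}{1-F(t)} - t, \qquad
\mathrm{MIT}(t) = E(t-X \mid X\le t) = t - \frac{I_1(t)}{F(t)} .
```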

Entropy
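The entropy of a random variable X measures the randomness in its probability distribution, and different types of entropy are not equally useful for all applications. Let X be a continuous random variable with PDF (2); then the Rényi entropy [18] is given by

$$R_r(x;\alpha,\theta,\phi) = \frac{1}{1-r}\,\log \int_{-\infty}^{\infty} f(x;\alpha,\theta,\phi)^r\,dx, \qquad r > 0,\ r \neq 1.$$

From PDF (2), we have

$$f(x;\alpha,\theta,\phi)^r = \frac{(\theta\alpha)^r(\alpha+1)^r\, G(x;\phi)^{r(\theta-1)}\, g(x;\phi)^r}{\bigl(\alpha + G(x;\phi)^{\theta}\bigr)^{2r}} = \frac{\theta^r(\alpha+1)^r\, G(x;\phi)^{r(\theta-1)}\, g(x;\phi)^r}{\alpha^r \Bigl(1 + \dfrac{G(x;\phi)^{\theta}}{\alpha}\Bigr)^{2r}}$$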

Maximum likelihood estimation
Maximum likelihood is the most common method for estimating unknown parameters (for more details, see [20]). Let $X_1, X_2, \ldots, X_n$ be a random sample with PDF (2); then the log-likelihood function is given by

$$L(x;\alpha,\theta,\phi) = n\log\theta + n\log\alpha + n\log(\alpha+1) + (\theta-1)\sum_{i=1}^{n}\log G(x_i;\phi) + \sum_{i=1}^{n}\log g(x_i;\phi) - 2\sum_{i=1}^{n}\log\bigl(\alpha + G(x_i;\phi)^{\theta}\bigr). \qquad (19)$$

The required estimates are obtained by differentiating Eq (19) with respect to each parameter, giving $\partial L/\partial\alpha$, $\partial L/\partial\theta$, and $\partial L/\partial\phi$, and equating the resulting expressions to zero. The second-order partial derivatives, starting with $\partial^2 L/\partial\alpha^2$, are then determined to construct the Hessian matrix of the proposed model. The inverse of the negative Hessian matrix gives the covariance matrix of the estimators, and the square roots of its diagonal elements give the standard errors of the estimators.
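The log-likelihood can be written out for a concrete baseline. The sketch below is illustrative only: it assumes the PDF implied by the Entropy subsection and an exponential baseline with rate b, so that G(x) = 1 − exp(−bx) and g(x) = b·exp(−bx). The function names are hypothetical.

```python
import math

def exmed_pdf(x, alpha, theta, b):
    """Assumed PDF: theta*alpha*(alpha+1)*G^(theta-1)*g / (alpha + G^theta)^2."""
    G = 1.0 - math.exp(-b * x)          # baseline CDF (rate parameterization assumed)
    g = b * math.exp(-b * x)            # baseline PDF
    return theta * alpha * (alpha + 1.0) * G ** (theta - 1.0) * g / (alpha + G ** theta) ** 2

def exmed_loglik(data, alpha, theta, b):
    """Log-likelihood of a sample of positive observations, term by term."""
    n = len(data)
    total = n * (math.log(theta) + math.log(alpha) + math.log(alpha + 1.0))
    for x in data:
        G = 1.0 - math.exp(-b * x)
        log_g = math.log(b) - b * x
        total += (theta - 1.0) * math.log(G) + log_g - 2.0 * math.log(alpha + G ** theta)
    return total

if __name__ == "__main__":
    sample = [0.5, 1.0, 2.5, 0.8, 3.1]
    print(exmed_loglik(sample, alpha=2.0, theta=1.5, b=0.7))
```

In practice the negative of `exmed_loglik` could be handed to a general-purpose numerical optimizer (for example `scipy.optimize.minimize` in Python or `optim` in R), with the parameters constrained to be positive.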

Ordinary least-squares and weighted least-squares estimation
Let $x_{1:n}, x_{2:n}, \ldots, x_{n:n}$ be the order statistics of a random sample of size n from the extended Modi-G family of distributions. The OLSE estimates are obtained by solving simultaneously the three non-linear equations that result from minimizing the least-squares criterion of Eq (20) with respect to the parameters (for further detail, see [21]). Similarly, the WLSE is determined by minimizing the corresponding weighted least-squares criterion.
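The two criteria can be sketched as objective functions. The code below uses the standard forms from the literature, comparing F(x_{i:n}) with i/(n+1) and weighting by (n+1)²(n+2)/(i(n−i+1)); whether these match the article's Eq (20) exactly is an assumption. The CDF with an exponential baseline of rate b is likewise an assumed illustration.

```python
import math

def exmed_cdf(x, alpha, theta, b):
    """Assumed CDF with exponential baseline G(x) = 1 - exp(-b*x)."""
    g_theta = (1.0 - math.exp(-b * x)) ** theta
    return (1.0 + alpha) * g_theta / (alpha + g_theta)

def ols_objective(params, data):
    """Sum of squared gaps between F(x_(i)) and i/(n+1)."""
    alpha, theta, b = params
    xs = sorted(data)
    n = len(xs)
    return sum((exmed_cdf(x, alpha, theta, b) - i / (n + 1)) ** 2
               for i, x in enumerate(xs, start=1))

def wls_objective(params, data):
    """Same gaps, weighted by the inverse variance of the uniform order statistics."""
    alpha, theta, b = params
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (exmed_cdf(x, alpha, theta, b) - i / (n + 1)) ** 2
    return total

if __name__ == "__main__":
    sample = [0.3, 0.9, 1.4, 2.2, 3.5]
    print(ols_objective((2.0, 1.5, 0.7), sample), wls_objective((2.0, 1.5, 0.7), sample))
```

Either function can be minimized over (α, θ, b) with a numerical optimizer to obtain the corresponding estimates.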

Anderson-Darling and right-tail Anderson-Darling estimation
The ADEs of the unknown parameters of the extended Modi-G family of distributions are obtained by minimizing the Anderson-Darling criterion. Similarly, the RTADEs of the parameters are obtained by minimizing the right-tail Anderson-Darling criterion.

Cramér-von Mises estimation
The CVMEs of the unknown parameters of the extended Modi-G family of distributions are obtained by minimizing the Cramér-von Mises criterion (for more details, see [22]).
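The Anderson-Darling, right-tail Anderson-Darling, and Cramér-von Mises criteria can be sketched in the same way. The formulas below are the standard ones from the goodness-of-fit literature and are assumed, not confirmed, to coincide with the article's lost equations. The CDF with an exponential baseline of rate b is an illustrative assumption.

```python
import math

def exmed_cdf(x, alpha, theta, b):
    """Assumed CDF with exponential baseline G(x) = 1 - exp(-b*x)."""
    g_theta = (1.0 - math.exp(-b * x)) ** theta
    return (1.0 + alpha) * g_theta / (alpha + g_theta)

def _sorted_cdf(params, data):
    alpha, theta, b = params
    return [exmed_cdf(x, alpha, theta, b) for x in sorted(data)]

def ad_objective(params, data):
    """-n - (1/n) * sum (2i-1) [log F(x_(i)) + log(1 - F(x_(n+1-i)))]."""
    F = _sorted_cdf(params, data)
    n = len(F)
    s = sum((2 * i - 1) * (math.log(F[i - 1]) + math.log(1.0 - F[n - i]))
            for i in range(1, n + 1))
    return -n - s / n

def rtad_objective(params, data):
    """n/2 - 2 * sum F(x_(i)) - (1/n) * sum (2i-1) log(1 - F(x_(n+1-i)))."""
    F = _sorted_cdf(params, data)
    n = len(F)
    s = sum((2 * i - 1) * math.log(1.0 - F[n - i]) for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum(F) - s / n

def cvm_objective(params, data):
    """1/(12n) + sum (F(x_(i)) - (2i-1)/(2n))^2."""
    F = _sorted_cdf(params, data)
    n = len(F)
    return 1.0 / (12 * n) + sum((F[i - 1] - (2 * i - 1) / (2 * n)) ** 2
                                for i in range(1, n + 1))

if __name__ == "__main__":
    sample = [0.3, 0.9, 1.4, 2.2, 3.5]
    for f in (ad_objective, rtad_objective, cvm_objective):
        print(f.__name__, f((2.0, 1.5, 0.7), sample))
```

The logarithms require every fitted CDF value to lie strictly between 0 and 1, so an optimizer should be kept away from parameter values that push F to the boundary.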

Maximum product of spacings estimation
The maximum product of spacings (MPS) method [23] is used to estimate the parameters of continuous univariate models as an alternative to the ML method. The uniform spacings of a random sample of size n from the extended Modi-G family can be defined by $D_i = F(x_{i:n}) - F(x_{i-1:n})$, $i = 1, \ldots, n+1$, where $D_i$ denotes the uniform spacings, $F(x_{0:n}) = 0$, $F(x_{n+1:n}) = 1$, and $\sum_{i=1}^{n+1} D_i = 1$. The MPS estimators of the parameters are obtained by maximizing the geometric mean of the spacings.
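The spacings criterion can be sketched as follows, using the mean log-spacing, which is the logarithm of the geometric mean and so has the same maximizer. The CDF with an exponential baseline of rate b is an assumed illustration, and the function names are hypothetical.

```python
import math

def exmed_cdf(x, alpha, theta, b):
    """Assumed CDF with exponential baseline G(x) = 1 - exp(-b*x)."""
    g_theta = (1.0 - math.exp(-b * x)) ** theta
    return (1.0 + alpha) * g_theta / (alpha + g_theta)

def mps_objective(params, data):
    """Mean log-spacing (1/(n+1)) * sum log D_i, to be maximized.

    Tied observations give a zero spacing and make the logarithm undefined;
    this sketch does not handle ties.
    """
    alpha, theta, b = params
    xs = sorted(data)
    n = len(xs)
    # Pad with F(x_0) = 0 and F(x_{n+1}) = 1 so that there are n + 1 spacings.
    cdf_vals = [0.0] + [exmed_cdf(x, alpha, theta, b) for x in xs] + [1.0]
    spacings = [cdf_vals[i] - cdf_vals[i - 1] for i in range(1, n + 2)]
    return sum(math.log(d) for d in spacings) / (n + 1)

if __name__ == "__main__":
    sample = [0.3, 0.9, 1.4, 2.2, 3.5]
    print(mps_objective((2.0, 1.5, 0.7), sample))
```

Because the spacings sum to one, the objective is bounded above by log(1/(n+1)), attained when all spacings are equal.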

A special sub-model
In this section, we define a three-parameter sub-model of the proposed family by taking the exponential distribution as the baseline; it is called the extended Modi exponential distribution (ExMED). The CDF and PDF of the ExMED follow from Eqs (1) and (2), and its survival function (SF), hazard rate function (HRF), and reversed hazard rate function (RHRF) follow from them in turn. Figs 1 and 2 display the PDF and HRF plots of the ExMED, respectively. As these figures demonstrate, the ExMED can handle decreasing, decreasing-constant, increasing, and increasing-constant hazard rate functions. In addition, its densities can be symmetrical, left-skewed, right-skewed, J-shaped, and reversed-J-shaped.
The quantile function of the ExMED follows from the QF of the family. Let u ~ uniform(0, 1). Then the ExMED's QF can be used to produce random data sets of size n from this distribution.
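Random data generation through the quantile function can be sketched as below. The closed form used for the QF assumes the CDF implied by the Entropy subsection together with an exponential baseline G(x) = 1 − exp(−bx) with rate b. Both the parameterization and the function names are illustrative assumptions.

```python
import math
import random

def exmed_quantile(u, alpha, theta, b):
    """Assumed QF: x = -log(1 - [alpha*u / (1 + alpha - u)]^(1/theta)) / b."""
    base = (alpha * u / (1.0 + alpha - u)) ** (1.0 / theta)
    return -math.log(1.0 - base) / b

def exmed_sample(n, alpha, theta, b, seed=None):
    """Draw n values by applying the QF to uniform(0, 1) variates."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    return [exmed_quantile(rng.random(), alpha, theta, b) for _ in range(n)]

if __name__ == "__main__":
    data = exmed_sample(10, alpha=2.0, theta=1.5, b=0.7, seed=42)
    print([round(x, 3) for x in data])
```

Fixing the seed makes a simulated data set reproducible, which is convenient when comparing estimation methods on the same samples.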

Simulation results of estimation methods
We explore the performance of the aforementioned estimation methods in estimating the ExMED parameters using simulation. We consider various sample sizes, n = 20, 70, 100, 250, 1000, and various parameter values. We generate N = 1000 random samples from the ExMED and determine the average estimates (AESTs), the average absolute biases (ABs), the average mean square errors (AMSEs), and the average mean relative estimates (AMREs) for all sample sizes and parameter combinations using the R software.
The AESTs, ABs, AMSEs, and AMREs are calculated for each component of the parameter vector $\boldsymbol{\theta} = (\alpha, \theta, b)'$. The results of the simulation study, including the AESTs, ABs, AMSEs, and AMREs, are reported in Tables 1-4. The row labeled ∑ Ranks gives the partial sum of the ranks of the ABs, AMSEs, and AMREs. A superscript indicates the rank of each estimator among all the estimators for that metric.
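The four summary measures can be sketched for a single parameter as follows. The definitions used here (mean estimate, mean absolute deviation from the true value, mean squared deviation, and mean absolute deviation relative to the true value) are the ones commonly used in simulation studies of this kind and are assumed, not confirmed, to match the article's lost equations. The function name and example numbers are hypothetical.

```python
def simulation_summary(estimates, true_value):
    """Summarize N replicate estimates of one parameter against its true value."""
    n_rep = len(estimates)
    aest = sum(estimates) / n_rep
    ab = sum(abs(e - true_value) for e in estimates) / n_rep
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    mre = sum(abs(e - true_value) / true_value for e in estimates) / n_rep
    return {"AEST": aest, "AB": ab, "MSE": mse, "MRE": mre}

if __name__ == "__main__":
    # Hypothetical replicate estimates of a parameter whose true value is 2.0.
    print(simulation_summary([1.8, 2.1, 2.3, 1.9], true_value=2.0))
```

Applying the function to each parameter and each estimation method gives the entries that can then be ranked across methods.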
The following observations can be drawn from Tables 1-4.
• All the estimators reveal the consistency property, i.e., the MSE decreases when the sample size increases.
• ABs of all estimates decrease when n increases for all estimation methods.
• AMREs of all estimates decrease when n increases for all estimation methods.
• In terms of the performance of the estimation methods, we found that the MPSE estimates are the best, as they produce the smallest biases, MSEs, and MREs for most of the configurations considered in our study. The next best estimators are the MLEs, followed by the ADEs. The overall positions of the estimators are presented in Table 5, from which we can confirm the superiority of the MPSE. In summary, based on Table 5, the

Data analysis
In this section, we use three real data sets from the fields of medicine and geology to illustrate the superiority of the proposed model in fitting these data sets over other related models. The first data set consists of the remission times (in months) of a random sample of 128 bladder cancer patients and was introduced in Lee and Wang [24]. The second data set consists of measurements made on patients with malignant melanoma. We compare the proposed distribution with some other well-known and related competing distributions, including the Modi exponential distribution (MED) [14], modified Kies exponential distribution (MKED) [27], alpha power exponential distribution (APED) [10], exponential distribution (ED), exponentiated exponential distribution (ExED), generalized log-logistic exponential distribution (GLLED) [28], linear exponential distribution (LNED) [29], logistic exponential distribution (LED) [30], Marshall-Olkin exponential distribution (MOED) [3], Nadarajah-Haghighi exponential distribution (NHED) [31], odd exponentiated half logistic exponential distribution (OExHLED) [32], odd inverse Pareto exponential distribution (OIPRED) [33], transmuted exponential distribution (TED) [29], and transmuted generalized exponential distribution (TGED) [34].
The competing models can be compared using discrimination measures such as the Akaike information criterion (AKIC) and the consistent Akaike information criterion (CAKIC). The MLEs and the analytical measures are computed using Wolfram Mathematica version 12.0. Tables 9-14 give the analytical measures along with the MLEs and their standard errors for the three data sets, respectively. The results in these tables indicate that the ExMED provides better fits than the competing models and could be chosen as an adequate model for analyzing medicine (cancer) and geology (earthquake) data sets.
The fitted PDF, CDF, SF, and probability-probability (P-P) plots of the ExMED for the three data sets are shown in Fig 4. Furthermore, we use the seven estimation approaches discussed in Section 4 to estimate the ExMED parameters. Table 15 reports the resulting parameter estimates and negative log-likelihood values, along with goodness-of-fit measures, for the three data sets. Based on the values of KS and KSP (the Kolmogorov-Smirnov statistic and its p-value) listed in Table 15, we conclude that the seven estimation methods perform well in fitting the three data sets. The P-P plots and histograms of the three data sets with the fitted ExMED density for the various estimation methods are shown in Figs 5-7, respectively, and support the results in Table 15.

Conclusion
A new family of life distributions called the extended Modi-G family is presented, and general expressions for some of its mathematical and statistical properties, including the quantile function, moments, moment generating function, incomplete moments, inequality curves, Rényi entropy, and Shannon entropy, are derived. The maximum likelihood, ordinary least squares, weighted least squares, Anderson-Darling, right-tailed Anderson-Darling, Cramér-von Mises, and maximum product of spacings methods were discussed for estimating the model parameters. A special sub-model called the extended Modi exponential distribution was derived. Its density function can be symmetric, left-skewed, right-skewed, increasing, J-shaped, and inverse J-shaped, and its hazard rate function can be upside-down bathtub, decreasing, decreasing-constant, increasing, and increasing-constant. Different data sets were analyzed, and the superiority of the extended Modi exponential distribution in fitting these data sets over the other compared distributions was illustrated.

Fig 5. The P-P plots and histogram of data set I with the fitted ExMED PDFs for all estimation methods. https://doi.org/10.1371/journal.pone.0308094.g005

Fig 8 provides the TTT plots and the plots of the estimated HRF of the ExMED for the three data sets, respectively. They reveal that the ExMED HRFs have unimodal shapes. This agrees with the TTT plot of each data set. The existence and uniqueness of the proposed model's estimated parameters are shown graphically in Fig 9 for the three data sets. These estimated parameters were calculated using

Table 5. Partial and overall ranks of all the estimation methods for the ExMED for various values of α, θ, and b.
https://doi.org/10.1371/journal.pone.0308094.t005

Table 13. Discrimination measures of the ExMED model and other competing models for the third data set.
performance ordering of estimators from best to worst for all parameter combinations is MPSE, MLE, ADE, WLSE, RTADE, CVME, and LSE.
Each patient had their tumor removed by surgery at the Department of Plastic Surgery, University Hospital of Odense, Denmark, from 1962 to 1977. The data set consists of 7 variables, each with 205 observations; we studied the sixth variable (thickness: tumour thickness in mm). It was obtained from Andersen et al. [25]. The third data set gives peak accelerations measured at various observation stations for 23 earthquakes in California and is referred to in [26] by Joyner and Boore. It consists of 5 variables, each with 182 observations; we studied the fourth variable (dist: station-hypocenter distance in km). The numerical values of the data sets are given in Tables 6-8, respectively.

Table 15. The estimates and negative log-likelihood function of the ExMED along with goodness-of-fit measures for the three data sets.
https://doi.org/10.1371/journal.pone.0308094.t015